Remove preserve_zero and zero_point_domain from choose_qparams_affine #2149
Conversation
def choose_qparams_affine_tiny_gemm(
    input: torch.Tensor,
    mapping_type: MappingType,
    block_size: Tuple[int, ...],
nit: change this to Tuple[int] as well to be consistent, assuming it means the same thing
Block size is a tuple with multiple integers, hence it will need to be Tuple[int, ...]
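For context on this reply: in torchao, block_size carries one entry per input dimension (e.g. (1, 32) for per-group quantization of a 2D weight), so its length varies with tensor rank, which is why the annotation needs the ellipsis form. The helper below is hypothetical and only illustrates that shape relationship:

import torch
from typing import Tuple

def num_quantization_groups(shape: Tuple[int, ...], block_size: Tuple[int, ...]) -> int:
    # block_size has one entry per tensor dimension, which is why
    # Tuple[int, ...] rather than Tuple[int] is the right annotation.
    assert len(shape) == len(block_size), "one block-size entry per dimension"
    groups = 1
    for dim, block in zip(shape, block_size):
        assert dim % block == 0, "each dimension must divide evenly into blocks"
        groups *= dim // block
    return groups

weight = torch.randn(128, 256)
print(num_quantization_groups(tuple(weight.shape), (1, 32)))  # 128 * 8 = 1024 groups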
    target_dtype: torch.dtype,
    quant_min: Optional[Union[int, float]] = None,
    quant_max: Optional[Union[int, float]] = None,
    eps: Optional[float] = None,
    scale_dtype: Optional[torch.dtype] = None,
    zero_point_dtype: Optional[torch.dtype] = None,
I think we could probably simplify this list as well; only the configurable things are needed. This can be a separate PR.
Agreed
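A rough sketch of what the trimmed-down list might look like (hypothetical, for the follow-up PR, not an agreed-upon API):

def choose_qparams_affine_tiny_gemm(
    input: torch.Tensor,
    mapping_type: MappingType,
    block_size: Tuple[int, ...],
    target_dtype: torch.dtype,
    quant_min: Optional[int] = None,
    quant_max: Optional[int] = None,
) -> Tuple[torch.Tensor, torch.Tensor]:
    # eps, scale_dtype, and zero_point_dtype would be derived internally
    # from input.dtype and target_dtype instead of being parameters.
    ...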
    target_dtype: torch.dtype,
    quant_min: Optional[Union[int, float, bool]] = None,
    quant_max: Optional[Union[int, float, bool]] = None,
    eps: Optional[float] = None,
    scale_dtype: Optional[torch.dtype] = None,
    zero_point_dtype: Optional[torch.dtype] = None,
same here
    MappingType.SYMMETRIC.name,
    MappingType.SYMMETRIC_NO_CLIPPING_ERR.name,
    MappingType.ASYMMETRIC.name,
    MappingType.SYMMETRIC,
if this op has to be lowered, we'd need to use str instead of enum
For all the new ops I've used the MappingType enum. Should I update them to str?
I think just making sure the default one (ZeroPointDomain=INT and preserve_zero=True) can be lowered is fine for now
choose_qparams_affine_with_min_max might be lowered as well I think, but you can add a TODO and worry about this a bit later as well
Okay
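For readers following the str-vs-enum point: torch.library custom-op schemas only accept types like Tensor, int, float, bool, and str, so a Python enum can't cross the op boundary directly. A minimal sketch of the usual workaround, with a hypothetical op name (not this PR's actual registration):

import torch
from enum import Enum

class MappingType(Enum):
    SYMMETRIC = "symmetric"
    ASYMMETRIC = "asymmetric"

# Hypothetical registration: the schema takes a str, since custom-op
# schemas cannot express Python enums.
@torch.library.custom_op("my_ns::choose_qparams_demo", mutates_args=())
def choose_qparams_demo(input: torch.Tensor, mapping_type: str) -> torch.Tensor:
    mt = MappingType[mapping_type]  # convert back to the enum inside the op
    if mt == MappingType.SYMMETRIC:
        return input.abs().amax().unsqueeze(0)
    return (input.amax() - input.amin()).unsqueeze(0)

# Callers pass the enum's name across the op boundary:
scale = choose_qparams_demo(torch.randn(8), MappingType.SYMMETRIC.name)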
@@ -301,12 +305,6 @@ def quantize_affine(
        output_dtype (torch.dtype): requested dtype (e.g. torch.uint8) for output Tensor
        quant_min (Optional[int]): minimum quantized value for output Tensor, if not specified, it will be derived from dtype
        quant_max (Optional[int]): maximum quantized value for output Tensor, if not specified, it will be derived from dtype
        zero_point_domain (ZeroPointDomain): the domain that zero_point is in, should be either integer or float
we should probably preserve these for now, and move them to quant_api; same for the doc for the preserve_zero arg
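For readers following the removal: the two domains the deleted docstring referred to differ in how dequantization applies the zero point. A simplified sketch of the distinction as implemented in torchao's quant primitives (illustrative only, not this PR's code):

# ZeroPointDomain.INT: zero_point is an integer in the quantized domain.
def dequant_int_domain(q, scale, zp):
    return (q - zp) * scale

# ZeroPointDomain.FLOAT (tinygemm-style): zero_point is a float in the
# original domain, applied after rescaling around the quant-range midpoint.
def dequant_float_domain(q, scale, zp, quant_min, quant_max):
    mid_point = (quant_max + quant_min + 1) / 2
    return (q - mid_point) * scale + zp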
No description provided.